Fooling End-to-end Speaker Verification by Adversarial Examples
Authors
Felix Kreuk, Yossi Adi, Moustapha Cisse, Joseph Keshet
Abstract
Automatic speaker verification systems are increasingly used as the primary means to authenticate customers. Recently, it has been proposed to train speaker verification systems using end-to-end deep neural models. In this paper, we show that such systems are vulnerable to adversarial example attacks. Adversarial examples are generated by adding a peculiar noise to original speaker examples, in such a way that they are almost indistinguishable from the originals to a human listener. Yet, the generated waveforms, which sound like speaker A, can be used to fool such a system into accepting them as if they were uttered by speaker B. We present white-box attacks on a deep end-to-end network that was trained on either YOHO or NTIMIT. We also present two black-box attacks. In the first, we generate adversarial examples with a system trained on NTIMIT and perform the attack on a system trained on YOHO. In the second, we generate the adversarial examples with a system trained using Mel-spectrum features and perform the attack on a system trained using MFCCs. Our results show that one can significantly decrease the accuracy of a target system even when the adversarial examples are generated with a different system, potentially using different features.
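As a rough illustration of the attack family described in the abstract, below is a minimal sketch of a single gradient-sign (FGSM-style) perturbation applied directly to a raw waveform. The `verifier` model, the loss choice, and `epsilon` are illustrative assumptions, not the paper's exact setup.

```python
# Minimal FGSM-style sketch of a white-box attack on an end-to-end
# speaker verifier operating on raw waveforms. `verifier`, the loss,
# and `epsilon` are illustrative assumptions, not the paper's setup.
import torch
import torch.nn.functional as F

def fgsm_waveform(verifier, waveform, target, epsilon=1e-3):
    """One signed-gradient step that nudges the verifier's decision
    toward `target` (a float tensor, e.g. torch.ones_like(logit) for
    "accept as claimed speaker") while keeping the noise nearly
    inaudible."""
    x = waveform.clone().detach().requires_grad_(True)
    logit = verifier(x)  # verification score for the claimed identity
    loss = F.binary_cross_entropy_with_logits(logit, target)
    loss.backward()
    # Subtract the sign of the gradient: a targeted step that lowers
    # the loss with respect to the attacker's desired decision. This
    # is the "peculiar noise" added to the original example.
    x_adv = x - epsilon * x.grad.sign()
    return x_adv.clamp(-1.0, 1.0).detach()
```

In a black-box transfer setting like the ones described above, the same perturbation would be computed against a surrogate verifier (trained on different data or features) and then presented to the target system.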
Similar Resources
Crafting Adversarial Examples For Speech Paralinguistics Applications
Computational paralinguistic analysis is increasingly being used in a wide range of applications, including security-sensitive applications such as speaker verification, deceptive speech detection, and medical diagnostics. While state-of-the-art machine learning techniques, such as deep neural networks, can provide robust and accurate speech analysis, they are susceptible to adversarial attacks. ...
End-to-end Adversarial Learning for Generative Conversational Agents
This paper presents a new adversarial learning method for generative conversational agents (GCA) besides a new model of GCA. Similar to previous works on adversarial learning for dialogue generation, our method assumes the GCA as a generator that aims at fooling a discriminator that labels dialogues as human-generated or machine-generated; however, in our approach, the discriminator performs to...
Improving Speaker-Independent Lipreading with Domain-Adversarial Training
We present a Lipreading system, i.e. a speech recognition system using only visual features, which uses domain-adversarial training for speaker independence. Domain-adversarial training is integrated into the optimization of a lipreader based on a stack of feedforward and LSTM (Long Short-Term Memory) recurrent neural networks, yielding an end-to-end trainable system which only requires a very ...
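Domain-adversarial training of the kind mentioned in this snippet is commonly implemented with a gradient-reversal layer (in the style of Ganin and Lempitsky); the following is a hedged PyTorch sketch of that building block, not the lipreading paper's actual feedforward/LSTM architecture.

```python
# Hedged sketch of a gradient-reversal layer, the standard building
# block for domain-adversarial training.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        # Identity on the forward pass, negated (scaled) gradient on
        # the backward pass: the feature extractor is pushed to make
        # the domain (speaker) classifier fail, yielding
        # speaker-invariant features.
        return -ctx.lam * grad_out, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)
```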
Safety Verification of Deep Neural Networks
Deep neural networks have achieved impressive experimental results in image classification, but can surprisingly be unstable with respect to adversarial perturbations, that is, minimal changes to the input image that cause the network to misclassify it. With potential applications including perception modules and end-to-end controllers for self-driving cars, this raises concerns about their saf...
Generalized End-to-End Loss for Speaker Verification
In this paper, we propose a new loss function called generalized end-to-end (GE2E) loss, which makes the training of speaker verification models more efficient than our previous tuple-based end-to-end (TE2E) loss function. Unlike TE2E, the GE2E loss function updates the network in a way that emphasizes examples that are difficult to verify at each step of the training process. Additionally, the ...
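A minimal sketch of the softmax variant of the GE2E loss described in this snippet is given below. The tensor layout, the fixed scale `w` and bias `b` (learned parameters in the actual method), and the omission of the leave-one-out centroid correction for an utterance's own speaker are simplifying assumptions.

```python
# Hedged sketch of the GE2E softmax loss, assuming embeddings arranged
# as [n_speakers, n_utterances, dim].
import torch
import torch.nn.functional as F

def ge2e_softmax_loss(emb, w=10.0, b=-5.0):
    n_spk, n_utt, _ = emb.shape
    emb = F.normalize(emb, dim=-1)
    # Per-speaker centroids. (The real GE2E loss excludes the current
    # utterance from its own speaker's centroid; that correction is
    # skipped here for brevity.)
    centroids = F.normalize(emb.mean(dim=1), dim=-1)       # [n_spk, dim]
    # Scaled cosine similarity of every utterance to every centroid.
    sim = w * torch.einsum('sud,kd->suk', emb, centroids) + b
    # Each utterance should be most similar to its own speaker's
    # centroid; hard-to-verify examples dominate the cross-entropy.
    labels = torch.arange(n_spk).unsqueeze(1).expand(n_spk, n_utt)
    return F.cross_entropy(sim.reshape(-1, n_spk), labels.reshape(-1))
```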
Journal: CoRR
Volume: abs/1801.03339
Pages: -
Publication date: 2018